Accurate Prediction of Protein Structural Class
Abstract
Because of the widening gap between the volume of data from sequencing and from structural genomics, accurately predicting the structural class of a protein domain from its primary sequence alone remains a challenging problem in structural biology. Traditional sequence-based predictors generally select several sequence features and feed them directly into a classifier to identify the structural class. The current best sequence-based predictor achieved an overall accuracy of 74.1% when tested on 25PDB, a widely used non-homologous benchmark dataset. In the present work, we built a multiple linear regression (MLR) model that converts the 440-dimensional (440D) sequence feature vector extracted from the Position-Specific Scoring Matrix (PSSM) of a protein domain into a 4-dimensional (4D) structural feature vector, which is then used to predict the four major structural classes. We performed 10-fold cross-validation and jackknife tests of the method on a large non-homologous dataset containing 8,244 domains distributed among the four major classes. Our approach outperformed all existing sequence-based methods, with an overall accuracy of 83.1%, which is even higher than that of methods based on predicted secondary structure.
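The 440D-to-4D regression step and the 10-fold cross-validation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the function names (`fit_mlr`, `predict`, `cross_validate`, `make_toy_data`) are hypothetical, the regression targets are taken to be one-hot class indicators, the predicted class is taken to be the largest of the four outputs, and the data are synthetic rather than PSSM-derived features.

```python
import numpy as np

# Conventional names for the four major structural classes (illustrative only).
CLASSES = ["all-alpha", "all-beta", "alpha/beta", "alpha+beta"]


def _add_intercept(X):
    """Append a column of ones so the linear model has an intercept term."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_mlr(X, labels, n_classes=4):
    """Least-squares fit of a linear map from feature vectors to one-hot targets.

    Assumption: the 4D structural vector is regressed onto one-hot class labels.
    """
    Y = np.eye(n_classes)[labels]                        # (n, 4) one-hot targets
    W, *_ = np.linalg.lstsq(_add_intercept(X), Y, rcond=None)
    return W                                             # (n_features + 1, 4)


def predict(W, X):
    """Map each feature vector to a 4D score vector and pick the largest entry."""
    scores = _add_intercept(X) @ W                       # (n, 4)
    return scores.argmax(axis=1)


def cross_validate(X, labels, k=10, seed=0):
    """Plain k-fold cross-validation; returns overall accuracy across all folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(labels)), k)
    correct = 0
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        W = fit_mlr(X[train_idx], labels[train_idx])
        correct += int((predict(W, X[test_idx]) == labels[test_idx]).sum())
    return correct / len(labels)


def make_toy_data(n, n_features=440, seed=0):
    """Synthetic stand-in for PSSM-derived features: one Gaussian blob per class."""
    centers = np.random.default_rng(12345).normal(size=(4, n_features))
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 4, size=n)
    X = centers[labels] + rng.normal(size=(n, n_features))
    return X, labels


if __name__ == "__main__":
    X, y = make_toy_data(2000, seed=1)
    print("10-fold CV accuracy on toy data:", cross_validate(X, y, k=10))
```

The toy classes are well separated by construction, so accuracy on this data says nothing about real protein domains. A jackknife test corresponds to setting `k` equal to the number of samples, at a much higher computational cost.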
Related articles
Accurate prediction of protein secondary structural class with fuzzy structural vectors.
The prerequisites for accurate prediction of protein secondary structural class (all-alpha, all-beta, alpha+beta, alpha/beta or multidomain) were studied, and a new similarity-based method is presented for the prediction of the secondary structural class of a protein from its sequence. The new method uses representatives of nuclear families as a learning set. For the sequence to be predicted, t...
PSSP-RFE: Accurate Prediction of Protein Structural Class by Recursive Feature Extraction from PSI-BLAST Profile, Physical-Chemical Property and Functional Annotations
Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction becomes an increasingly challenging task. Amongst most homological-based approaches, the accuracies o...
Development and performance evaluation of FLANN based model for protein structural class prediction
Abstract: During the last few decades, accurate prediction of protein structural class has been a challenging problem. Efficient and meaningful representation of the protein molecule plays a significant role. In this paper, Chou’s pseudo amino acid composition along with the amphiphilic correlation factor has been used to represent protein data. A simple functionally linked artificial neural network has be...
Protein Structural Class Prediction Using Differential Evolution
Protein structural class prediction has been a challenging problem in protein science for many years. In this paper we present a new optimization approach using differential evolution (DE) for predicting the protein structural class. It uses the maximum component coefficient principle in association with the amino acid composition feature vector to efficiently classify the protein domains. ...
New methods for accurate prediction of protein secondary structure.
A primary and a secondary neural network are applied to secondary structure and structural class prediction for a database of 681 non-homologous protein chains. A new method of decoding the outputs of the secondary structure prediction network is used to produce an estimate of the probability of finding each type of secondary structure at every position in the sequence. In addition to providing...
Journal title:
Volume 7, issue
Pages -
Publication date: 2012